Introduction
Algorithms to Live By explores the parallels between computer science and everyday life. For a long time we designed algorithms by imitating the real world and the methods we had learnt there. Increasingly, however, the pattern is reversing: algorithms developed for computers are turning out to apply to our real-world problems.
Our first example is optimal stopping, which gives a solution to a style of problem many of us will be familiar with: knowing when to stop. A good illustration is the Secretary Problem. You are chosen to hire the new secretary. There are one hundred applicants, interviewed in a random order, and your goal is to pick the best one. However, there is a catch. If you believe the applicant you are currently interviewing is the best, you can commit to them on the spot, but then you cannot interview the remaining candidates. If you let an applicant go, they move on and you cannot call them back. The same structure appears in many situations: when selling a house, which buyer should you commit to? When searching for love, when should you commit to a partner? The problem also assumes you have no prior knowledge of what a good secretary looks like, or of the best price for the house. You do not want to commit too early, since you need the first interviews to learn what a good secretary looks like; but you do not want to commit too late, since you risk having already passed over the best offer. Computer science and mathematics give a remarkably simple answer to this 'look then leap' type of problem: 37%. The optimal strategy is to spend the first 37% of the applicants purely looking, noting the best candidate you come across but committing to no one. Then, for the remaining 63%, you should leap at the first applicant who is better than everyone you have seen so far. Curiously, this method finds the single best applicant about 37% of the time, so even the optimal strategy will fail more often than not. It is an interesting symmetry in the problem.
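Below is a minimal sketch of this look-then-leap rule in Python, simulating many hiring rounds to check that the success rate really does sit near 37%. The function names and the choice of one hundred candidates are my own, purely for illustration.

```python
import random

def look_then_leap(candidates, look_fraction=0.37):
    """Apply the 37% rule: observe the first chunk without committing,
    then leap at the first candidate who beats everyone seen so far."""
    n = len(candidates)
    look = int(n * look_fraction)
    benchmark = max(candidates[:look]) if look > 0 else float("-inf")
    for score in candidates[look:]:
        if score > benchmark:
            return score
    return candidates[-1]  # forced to settle for the last applicant

def success_rate(n=100, trials=20_000):
    """Estimate how often the rule picks the single best candidate."""
    wins = 0
    for _ in range(trials):
        candidates = list(range(n))
        random.shuffle(candidates)
        if look_then_leap(candidates) == n - 1:
            wins += 1
    return wins / trials

if __name__ == "__main__":
    print(f"Picked the best candidate ~{success_rate():.1%} of the time")
```

Running it, the estimate lands close to 37%, matching the curious symmetry described above.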
Authors Brian Christian and Tom Griffiths later explore the concept of caching. Caching deals with the famous computer-science trade-off between speed and capacity: faster memory tends to be smaller, and larger memory tends to be slower. As a result, computers use a hierarchy of memory levels to improve performance, and this is caching. Caching also exploits the physical geography of a computer: memory that sits closer to the processor can be accessed more quickly, so the smallest, fastest caches are placed there, holding the instructions and data used most frequently and most recently. This raises interesting questions about what each level of the hierarchy should hold, which we will discuss shortly. Interestingly, computers are not the only ones using this approach. The Internet does something similar: 'content distribution networks' (CDNs) use networks of servers to keep copies of frequently requested webpages close to the areas that request them, mirroring the geographical structure of a computer's memory layout. The result is much quicker loading times for popular pages in a given region. The same thinking appears outside computing too, with Amazon using fulfilment warehouses and 'anticipatory package shipping' to get popular products to nearby customers as quickly as possible.
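A toy sketch of the idea, under my own simplifying assumptions: a small, fast cache sitting in front of a slow backing store (simulated here with a short sleep standing in for a disk, a database or a far-away server).

```python
import time

cache = {}  # the small, fast level of the hierarchy

def slow_fetch(key):
    """Stand-in for the slow level: disk, network, or a distant server."""
    time.sleep(0.1)
    return f"value-for-{key}"

def get(key):
    if key in cache:          # cache hit: answered from the fast level
        return cache[key]
    value = slow_fetch(key)   # cache miss: go to the slow level...
    cache[key] = value        # ...and keep a copy nearby for next time
    return value

get("homepage")   # slow the first time (around 0.1s)
get("homepage")   # nearly instant afterwards
```

A CDN does the same thing at the scale of the planet, with the 'cache' being a server near you and the 'slow fetch' being a request to the original site.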
As we mentioned previously, deciding what each cache level should hold at a given moment poses interesting dilemmas. What algorithm should decide the contents of each memory level in the hierarchy? The algorithm we would really like is what computer scientists call a 'clairvoyant' algorithm: an all-knowing algorithm that can predict future requests and fill the cache accordingly. Unfortunately, clairvoyance is not something we can actually build. Other strategies include 'random eviction', which throws out a randomly chosen item, and 'FIFO', which evicts on a first-in, first-out basis, but neither performs especially well. The best practical answer turns out to be a variant of 'LRU', or 'least recently used': when the cache is full, discard the item that has gone the longest without being used.
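Here is a minimal sketch of an LRU cache, using the standard library's OrderedDict; the class and method names are my own choices rather than anything from the book.

```python
from collections import OrderedDict

class LRUCache:
    """A small least-recently-used cache: when full, evict the item
    that has gone the longest without being accessed."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used
```

The appeal of LRU is that recent use is a good predictor of near-future use, which is as close to clairvoyance as we can realistically get.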
Humans use caching in their everyday lives. When you sit down to work, you load the relevant information from files, bookshelves and libraries onto your desk and return it when you are finished. This idea led to Yukio Noguchi's development of 'Noguchi filing'. The economist at the University of Tokyo wanted the most efficient way to store his files so that he could quickly load and unload work onto his desk. Knowing what to store where, and when, is the same dilemma we saw earlier with the caching levels. In the end, 'Noguchi filing' turned out to be a variant of the LRU algorithm. When Noguchi was done with a file, he simply put it at the front of a filing drawer; if there was no room, he demoted the last file in the drawer to the next drawer down. This meant that all of his recently used files sat near the front and were very easy to reach, and to find one he only had to recall roughly when he last used it. The files he used most recently are the ones he is most likely to need again soon, and they are the quickest to find. As it turns out, Noguchi filing is very close to an optimal strategy, which means the messy pile of papers on somebody's desk, with each finished document dropped back on top, is actually remarkably well organised. Once again we see the similarity between computer science and real life, and how algorithms from our computers can show that a particular real-world habit is a good one.
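A small sketch of the filing rule as I understand it from the description above, with drawers modelled as lists and capacities chosen arbitrarily for illustration.

```python
def file_document(drawers, capacities, doc):
    """Noguchi-style filing: the finished document goes to the front of
    the first drawer, and any overflow is demoted to the next drawer down."""
    for drawer in drawers:              # remove it if already filed somewhere
        if doc in drawer:
            drawer.remove(doc)
    drawers[0].insert(0, doc)
    for level in range(len(drawers) - 1):
        while len(drawers[level]) > capacities[level]:
            demoted = drawers[level].pop()         # the least recently used file
            drawers[level + 1].insert(0, demoted)  # push it one level down
    return drawers

drawers = [[], []]                      # a "near" drawer and an overflow drawer
for doc in ["tax", "draft", "notes", "tax", "receipts"]:
    file_document(drawers, capacities=[3, 10], doc=doc)
print(drawers)   # the most recently finished documents sit at the front
```

The cascade of demotions is exactly the multi-level memory hierarchy from the previous section, just made of cardboard.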
A final topic I found particularly interesting was the use of randomness, both in algorithms and in everyday life. Some problems are unrealistic to solve by computing every possible combination, for example problems involving a shuffled deck of cards. These are called intractable problems: they can be solved, but only in an unrealistic amount of time. For such problems, the mathematicians Stanislaw Ulam, John von Neumann and Nicholas Metropolis devised a different strategy, which became known as the Monte Carlo method. Instead of exhaustively calculating every probability, the Monte Carlo method feeds in random inputs to generate many sample results, letting you simulate and approximate the problem and reach a feasible answer. That answer is not guaranteed to be exact, but it is the most sensible way to handle the trade-off between quality of result and computing time that these problems impose. One place this is used is nuclear physics: calculating all the possibilities of a nuclear branching process would be nearly impossible because there are so many of them, but simulating the reaction through random samples gives a far more practical way to study it.
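As an illustrative stand-in (not the solitaire question Ulam actually studied), here is a Monte Carlo estimate of a card-shuffling probability that would be tedious to enumerate directly.

```python
import random

def estimate_match_probability(trials=100_000, deck_size=52):
    """Monte Carlo estimate of the chance that at least one card in a
    shuffled deck ends up back in its original position. Enumerating all
    52! orderings is hopeless; random sampling gets close very quickly."""
    hits = 0
    deck = list(range(deck_size))
    for _ in range(trials):
        shuffled = deck[:]
        random.shuffle(shuffled)
        if any(card == position for position, card in enumerate(shuffled)):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    # The exact answer is 1 - 1/e, about 0.632; the estimate lands near it.
    print(estimate_match_probability())
```

The answer is approximate, but it arrives in a second rather than after an astronomical enumeration, which is the whole point of the method.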
Another major use of randomness in algorithms is Michael Rabin's prime number algorithm. Prime numbers were long regarded as belonging to 'one of the most useless branches' of mathematics, yet they later became central to cryptography in computer science. Two large prime numbers can be multiplied together in milliseconds, but factorising the result back into those two primes can take millions of years of computation. This is what is known as a 'one-way function'. In cryptography, secret primes known only to the sender and recipient can be multiplied together to create an enormous number that is extremely difficult to break back down into the secret primes serving as the key, which means data can be transmitted securely. This created a growing demand for algorithms that could check whether a number is prime. Such algorithms existed, but once again they could not check large numbers in a reasonable amount of time.
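A small sketch of the one-way idea, using deliberately tiny primes of my own choosing so that naive factoring finishes at all: multiplying is essentially instant, while even this modest factorisation is already orders of magnitude slower, and the gap explodes as the primes grow to cryptographic sizes.

```python
import time

def naive_factor(n):
    """Factor n by trial division: workable for small n, utterly
    infeasible for the hundreds-of-digit numbers used in cryptography."""
    if n % 2 == 0:
        return 2, n // 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 2
    return None  # n itself is prime

p, q = 1_000_003, 1_000_033    # two (illustratively small) primes

start = time.perf_counter()
n = p * q                      # the "easy" direction
multiply_time = time.perf_counter() - start

start = time.perf_counter()
factors = naive_factor(n)      # the "hard" direction
factor_time = time.perf_counter() - start

print(f"multiply: {multiply_time:.2e}s, factor: {factor_time:.2e}s -> {factors}")
```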
As Michael Rabin was working on this problem, he ran into Gary Miller, a graduate of Berkeley. As part of his PhD thesis, Miller had created an algorithm that could test for primality in a feasible amount of time. The algorithm used a set of equations in terms of n (the number being tested) and x (a 'witness' against primality). The trouble was that it would sometimes give false positives: it could report that a number was prime when it was not. Rabin overcame this using exactly our idea of randomness in algorithms. He proved that a single test with a random witness x gives a false positive at most one quarter of the time, and that testing again with another random witness cuts the chance by another factor of four, to at most one in sixteen. The more times the test comes back 'prime' with fresh random values of x, the less likely the result is a false positive and the more likely the number really is prime. Test just fifteen times and the probability of a false positive is about one in a billion. Modern cryptography systems test forty times, giving a false positive rate of roughly one in a million billion billion, close to the number of grains of sand in the world. So while this method cannot give absolute certainty, it can make us certain enough, and that is enough for our security. I love how computer science keeps finding uses for mathematics that was once dismissed as useless.
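For the curious, here is a compact version of the randomised Miller-Rabin test, written from the standard textbook description rather than from the book itself; each random witness that fails to expose the number as composite cuts the chance of a false positive by a factor of four.

```python
import random

def is_probably_prime(n, rounds=40):
    """Randomised Miller-Rabin primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):   # handle tiny and obviously composite n
        if n % small == 0:
            return n == small
    # Write n - 1 as 2^r * d with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        x = random.randrange(2, n - 1)    # pick a random candidate witness
        y = pow(x, d, n)
        if y in (1, n - 1):
            continue
        for _ in range(r - 1):
            y = pow(y, 2, n)
            if y == n - 1:
                break
        else:
            return False                  # x is a witness: n is definitely composite
    return True                           # no witness found: n is almost certainly prime

print(is_probably_prime(2**61 - 1))       # a known prime: True
print(is_probably_prime(2**61 - 3))       # composite: False
```

With the default of forty rounds, a "True" answer is wrong with probability at most one in 2^80, the near-certainty described above.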
Conclusion
I have really enjoyed this book, and it has certainly opened my eyes to the deeper applications of computer science in our world, the applications you would never expect. Computer science teaches us about far more than computers: it teaches us how to think and how to solve problems, both mathematical ones and the messier, more abstract problems of the real world. The book has also introduced me to some of the most interesting problems computer scientists face on a regular basis, such as overfitting, and I am very excited to explore them further.